31 research outputs found

    HumaReC: Continuous Data Publishing in the Humanities

    HumaReC, a Swiss National Foundation project, aims to test a new publication process: continuous data publishing. This model requires research partnerships throughout the process, from the original document source to the publisher, and makes the data available as soon as it is produced. The object of study in HumaReC is a trilingual, 12th-century New Testament manuscript. https://ercim-news.ercim.eu/images/stories/EN111/EN111-web.pd

    Disease association and comparative genomics of compositional bias in human proteins [version 2; peer review: 2 approved]

    Background: The evolutionary rate of disordered protein regions varies greatly due to the lack of structural constraints. So far, few studies have investigated the presence/absence patterns of compositional bias, indicative of disorder, across phylogenies in conjunction with human disease. In this study, we report a genome-wide analysis of the association of compositional bias with disease in human proteins and of their taxonomic distribution. Methods: The human genome protein set provided by the Ensembl database was annotated and analysed with respect to both disease associations and the detection of compositional bias. The UniProt Reference Proteome dataset, containing 11297 proteomes, was used as the target dataset for the comparative genomics of a well-defined subset of the human genome, including 100 characteristic, compositionally biased proteins, some linked to disease. Results: Cross-evaluation of compositional bias and disease association in the human genome reveals a significant over-representation of biased regions in disease-associated genes, with charged, hydrophilic amino acids appearing as over-represented. The phylogenetic profiling of 17 disease-associated proteins with compositional bias across 11297 proteomes captures characteristic taxonomic distribution patterns. Conclusions: This is the first time that a combined genome-wide analysis of compositional bias, disease association and taxonomic distribution of human proteins is reported, covering structural, functional, and evolutionary properties. The reported framework can form the basis for large-scale follow-up projects, encompassing the entire human genome and all known gene-disease associations.

    Best practices for the manual curation of Intrinsically Disordered Proteins in DisProt

    The DisProt database is a significant resource containing manually curated data on experimentally validated intrinsically disordered proteins (IDPs) and regions (IDRs) from the literature. Developed in 2005, its primary goal was to collect structural and functional information on proteins that lack a fixed three-dimensional (3D) structure. Today, DisProt has evolved into a major repository that not only collects experimental data but also contributes significantly to our understanding of the roles of IDPs/IDRs in various biological processes, such as autophagy or the life cycle mechanisms of viruses, and of their involvement in diseases (such as cancer and neurodevelopmental disorders). DisProt offers detailed information on the structural states of IDPs/IDRs, including state transitions, interactions, and their functions, all provided as curated annotations. One of the central activities of DisProt is the meticulous curation of experimental data from the literature. For this reason, training courses and curation materials are available to ensure that every expert and volunteer curator possesses the requisite knowledge for data evaluation, collection, and integration. However, biocuration guidelines concur on the importance of developing robust guidelines that not only provide critical information about data consistency but also ensure data acquisition. This guideline aims to provide both biocurators and external users with best practices for manually curating IDPs and IDRs in DisProt. It describes every step of the literature curation process and provides use cases of IDP curation within DisProt. Database URL: https://disprot.org

    Establishment of computational biology in Greece and Cyprus: Past, present, and future.

    We review the establishment of computational biology in Greece and Cyprus from its inception to date and issue recommendations for future development. We compare their output with that of other countries of similar geography, economy, and size, based on publication counts recorded in the literature, and predict future growth based on those counts as well as national priority areas. Our analysis may be pertinent to wider national or regional communities with challenges and opportunities emerging from the rapid expansion of the field and related industries. Our recommendations suggest a 2-fold growth margin for the 2 countries as a realistic expectation for further expansion of the field, and the development of a credible roadmap of national priorities, in terms of both research and infrastructure funding.

    DisProt in 2022: improved quality and accessibility of protein intrinsic disorder annotation

    The Database of Intrinsically Disordered Proteins (DisProt, URL: https://disprot.org) is the major repository of manually curated annotations of intrinsically disordered proteins and regions from the literature. We report here recent updates of DisProt version 9, including a restyled web interface, refactored Intrinsically Disordered Proteins Ontology (IDPO), improvements in the curation process and significant content growth of around 30%. Higher quality and consistency of annotations is provided by a newly implemented reviewing process and training of curators. The increased curation capacity is fostered by the integration of DisProt with APICURON, a dedicated resource for the proper attribution and recognition of biocuration efforts. Better interoperability is provided through the adoption of the Minimum Information About Disorder (MIADE) standard, an active collaboration with the Gene Ontology (GO) and Evidence and Conclusion Ontology (ECO) consortia and the support of the ELIXIR infrastructure.

    Affiliation: Quaglia, Federica. Università di Padova; Italy. Consiglio Nazionale delle Ricerche; Italy
    Affiliation: Mészáros, Bálint. European Molecular Biology Laboratory; Germany
    Affiliation: Salladini, Edoardo. Università di Padova; Italy
    Affiliation: Hatos, András. Università di Padova; Italy
    Affiliation: Pancsa, Rita. Research Centre for Natural Sciences; Hungary
    Affiliation: Chemes, Lucia Beatriz. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Biotecnológicas; Argentina
    Affiliation: Pajkos, Mátyás. Eötvös Loránd University; Hungary
    Affiliation: Lazar, Tamas. Vlaams Instituut voor Biotechnology; Hungary. Vrije Universiteit Brussel; Belgium
    Affiliation: Peña Díaz, Samuel. Universitat Autònoma de Barcelona; Spain
    Affiliation: Santos, Jaime. Universitat Autònoma de Barcelona; Spain
    Affiliation: Ács, Veronika. Research Centre for Natural Sciences; Hungary
    Affiliation: Farahi, Nazanin. Vlaams Instituut voor Biotechnology; Belgium. Vrije Universiteit Brussel; Belgium
    Affiliation: Fichó, Erzsébet. Research Centre for Natural Sciences; Hungary
    Affiliation: Aspromonte, Maria Cristina. Università di Padova; Italy. Città della Speranza Pediatric Research Institute; Italy
    Affiliation: Bassot, Claudio. Stockholms Universitet; Sweden
    Affiliation: Chasapi, Anastasia. Centre for Research & Technology Hellas; Greece
    Affiliation: Davey, Norman E. Chester Beatty Laboratories; United Kingdom
    Affiliation: Davidović, Radoslav. University of Belgrade; Serbia
    Affiliation: Laszlo Holland, Alicia Verónica. European Molecular Biology Laboratory; Germany. Research Centre for Natural Sciences; Hungary
    Affiliation: Elofsson, Arne. Stockholms Universitet; Sweden
    Affiliation: Erdős, Gábor. Eötvös Loránd University; Hungary
    Affiliation: Gaudet, Pascale. Swiss Institute of Bioinformatics; Switzerland
    Affiliation: Giglio, Michelle. University of Maryland School of Medicine; United States
    Affiliation: Glavina, Juliana. Universidad Nacional de San Martín. Instituto de Investigaciones Biotecnológicas. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Biotecnológicas; Argentina
    Affiliation: Iserte, Javier Alonso. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigaciones Bioquímicas de Buenos Aires. Fundación Instituto Leloir. Instituto de Investigaciones Bioquímicas de Buenos Aires; Argentina
    Affiliation: Iglesias, Valentín. Universitat Autònoma de Barcelona; Spain
    Affiliation: Kálmán, Zsófia. Pázmány Péter Catholic University; Hungary
    Affiliation: Lambrughi, Matteo. Danish Cancer Society Research Center; Denmark
    Affiliation: Leonardi, Emanuela. Università di Padova; Italy. Pediatric Research Institute Città della Speranza; Italy
    Affiliation: Rodriguez Sawicki, Luciana. Universidad Nacional de Quilmes. Departamento de Ciencia y Tecnología; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentin

    DisProt: intrinsic protein disorder annotation in 2020

    The Database of Protein Disorder (DisProt, URL: https://disprot.org) provides manually curated annotations of intrinsically disordered proteins from the literature. Here we report recent developments with DisProt (version 8), including the doubling of protein entries, a new disorder ontology, improvements of the annotation format and a completely new website. The website includes a redesigned graphical interface, a better search engine, a clearer API for programmatic access and a new annotation interface that integrates text mining technologies. The new entry format provides greater flexibility, simplifies maintenance and allows the capture of more information from the literature. The new disorder ontology has been formalized and made interoperable by adopting the OWL format, and its structure and term definitions have been improved. The new annotation interface has made the curation process faster and more effective. We recently showed that new DisProt annotations can be effectively used to train and validate disorder predictors. We believe the growth of DisProt will accelerate, contributing to the improvement of function and disorder predictors and therefore helping to illuminate the ‘dark’ proteome.

    Critical assessment of protein intrinsic disorder prediction

    Abstract: Intrinsically disordered proteins, defying the traditional protein structure–function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646 proteins from DisProt. The best methods use deep learning techniques and notably outperform physicochemical methods. The top disorder predictor has Fmax = 0.483 on the full dataset and Fmax = 0.792 following filtering out of bona fide structured regions. Disordered binding regions remain hard to predict, with Fmax = 0.231. Interestingly, computing times among methods can vary by up to four orders of magnitude
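    The Fmax values quoted above are conventionally the maximum F1 score obtained over all decision thresholds applied to a predictor's per-residue scores. The sketch below illustrates that general definition only; the abstract does not state how CAID pools residues or chooses its threshold grid, so the function name `f_max`, the use of the observed scores as candidate thresholds, and the toy data are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def f_max(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (best F1, threshold achieving it) over all candidate thresholds.

    scores: predicted disorder propensity per residue (higher = more disordered).
    labels: reference annotation per residue (1 = disordered, 0 = ordered).
    Candidate thresholds are the distinct score values (an assumption of this
    sketch, not a documented CAID setting).
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    best_f1, best_threshold = 0.0, float("inf")
    for threshold in sorted(set(scores)):
        # A residue is predicted disordered when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
        denominator = 2 * tp + fp + fn
        # F1 = 2PR / (P + R), written in terms of counts to avoid dividing by zero.
        f1 = 2 * tp / denominator if denominator else 0.0
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold


if __name__ == "__main__":
    # Toy example: five residues, two of them annotated as disordered.
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.1]
    toy_labels = [1, 0, 1, 0, 0]
    print(f_max(toy_scores, toy_labels))
```

    Scanning every distinct score as a threshold is quadratic in the number of residues; for a dataset of the size described, a sorted cumulative count would be the more practical choice, but the direct form keeps the definition visible.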

    HumaReC : humanities research and continuous publishing : a digital New Testament test-case

    How is Humanities research reshaped and transformed by the digital rhythm of data production and publication? HumaReC investigates this question using the test-case of the edition of a unique trilingual Greek, Latin and Arabic New Testament manuscript; it is supported by a SNF grant (169869). The challenge of the project is to develop a new model of Humanities research that affords continuous publication of data and analysis as well as evaluation by experts. In this perspective, HumaReC offers several levels of publication. The manuscript viewer offers images and transcriptions that will be continuously updated with new folios. The research blog will regularly provide postings of research results and input regarding the manuscript and the digital development of the project. Continuous publishing offers the possibility of an evolving document that can change in response to discussions and reactions, and readers are encouraged to comment on the blog and to take part in the forum. The web book offers a completely new format for the Humanities: it will present research results peer-reviewed by an editorial board of experts and by the publisher Brill, change and grow over the course of the project period and be able to respond quickly to advancements in Humanities research. http://www.helveticat.ch/lib/item?id=chamo:1893712 http://permalink.snl.ch/bib/sz00189371